Percolation transition and distribution of connected 
components in generalized random network 
ensembles 

Serena Bradde^1 and Ginestra Bianconi^2 

^1 International School for Advanced Studies, via Beirut 2/4, 34014, Trieste, Italy and 
INFN, Via Valerio 2, Trieste, Italy 

^2 The Abdus Salam International Center for Theoretical Physics, Strada Costiera 11, 
34014, Trieste, Italy 

E-mail: bradde@sissa.it, gbiancon@ictp.it 

Abstract. In this work, we study the percolation transition and large deviation 
properties of generalized canonical network ensembles. These random 
networks can have a rich structure, including highly heterogeneous 
degree sequences, non-trivial community structure or a specific spatial dependence of 
the link probability for networks embedded in a metric space. We find the cluster 
distribution of the networks in these ensembles by mapping the problem to a fully 
connected Potts model with heterogeneous couplings. We show that the nature of the 
Potts model phase transition, linked to the birth of a giant component, crosses over 
from second to first order at the critical number of colors $q_c = 2$ in all the networks 
under study. These results shed light on the properties of dynamical processes defined 
on these network ensembles. 

1. Introduction 

Recently the study of critical phenomena in complex networks has attracted a great 
deal of interest [1]. One of the main critical phenomena occurring in networks 
is the percolation transition which is a continuous structural phase transition that 
can be characterized by critical indices as a statistical mechanics second-order phase 
transition. This phase transition determines the robustness properties of complex 
networks [21 El HI E] and the critical temperature of the Ising P, [3, E] and XY models 
[9], [10] on complex networks. Moreover, the onset of a percolating cluster marks a 
transition between a phase in which small loops are suppressed and a phase in which 
the expected number of small loops is positive in the limit of large network sizes [11]. 

The percolation phase transition in Erdős and Rényi networks is a classic subject 
of graph theory. For this network ensemble the large deviations of the number of 
connected components (or clusters) have been characterized by a mapping of the 
problem to a fully connected Potts model [H]. 

In uncorrelated complex networks, characterized by a non-Poisson degree 
distribution, the percolation transition depends on the second moment of the degree 
distribution [H [5] and can show non trivial critical exponents [1] . 

This phase transition has also been studied in directed networks [15] and in networks 
with degree-degree correlations [16]. 

In this paper we study the percolation properties and the large deviations of the 
cluster distribution of the recently proposed generalized canonical random network 
ensembles [17, 18] with non-trivial degree distribution and an additional community 
structure or spatial structure. These network ensembles can be cast in the wide category 
of Configuration or "hidden variable" models extensively studied in the recent literature 
[21 [191 [2ni [211 [221 [23] . The percolation properties and the large deviations of the cluster 
distribution in these ensembles are studied in this paper by mapping the problem to a 
fully connected Potts model with heterogeneous couplings. We find results in agreement 
with reference [24] where the Potts model formulation was first used for the study of the 
percolation properties of complex networks with heterogeneous degrees. In particular 
our framework generalizes the results of [21] and can be applied to network ensembles 
with very diverse structure: not only network ensembles with heterogeneous degree 
distribution, but also network ensembles with an additional non-trivial community or 
spatial structure. 

The paper is organized as follows. In section 2 we introduce the generalized 
canonical random ensembles. In section 3 we introduce the generating function for 
the cluster distribution and we characterize its large deviations. In section 4 we 
relate the problem of finding the cluster distribution in generalized canonical models, 
and their percolation transition, to the study of a fully connected Potts model with 
heterogeneous couplings. In section 5 we solve the fully connected Potts model with 
heterogeneous couplings and we find the percolation threshold and the critical exponent 
$\beta$ for the generalized canonical network ensembles. In section 6 we find the cluster 
distribution in the generalized canonical network ensembles. In section 7 we compare 
our theoretical predictions with simulation results. Finally in section 8 we give the 
conclusions. 



2. Random network ensembles 



In this section we introduce the generalized random ensembles described in [17, 18]. 
The generalized random ensembles are an extension of the known G(N, M) and G(N, p) 
random network ensembles and are related to Configuration and "hidden variable" 
ensembles [21 [HI [201 [HI [22l [23] . 

2.1. The G(N, M) and G(N, p) random network ensembles 

The mathematical literature has widely studied the properties of the G(N, M) and 
G(N, p) random network ensembles. 

• A random network in the ensemble G(N, M) is a network having N nodes and 
M undirected links. If we indicate with $a_{ij}$ the adjacency matrix of the network 
(with $a_{ij} = 1$ if there is a link between nodes i and j, and $a_{ij} = 0$ otherwise), the 
probability that a network $\mathcal{G}$, associated with the adjacency matrix a, belongs to the 
G(N, M) ensemble is given by 

$$P(\mathcal{G}) = \frac{1}{Z}\,\delta\Big(M, \sum_{i<j} a_{ij}\Big) \qquad (1)$$

with 

$$Z = \binom{N(N-1)/2}{M} \qquad (2)$$

and with $\delta(\cdot,\cdot)$ indicating the Kronecker delta. The probability of each link in 
this ensemble of networks is given by $p = M/(N(N-1)/2)$. 

• A network in the G(N, p) ensemble is a network in which each possible pair of nodes 
is linked with probability p. Therefore the probability of a specific network $\mathcal{G}$ in 
this ensemble is equal to 

$$P_c(\mathcal{G}) = \prod_{i<j} p^{a_{ij}} (1-p)^{1-a_{ij}} \qquad (3)$$

where $a_{ij}$ is the adjacency matrix. In the G(N, p) ensemble the total number of 
links M is not fixed but is Poisson distributed with mean $\langle M \rangle_{P_c(\mathcal{G})} = pN(N-1)/2$. 

The G(N, M) and the G(N, p) ensembles with $p = M/(N(N-1)/2)$ are linked by a 
Legendre transform and, in the asymptotic limit $N \to \infty$, they share the same 
statistical properties. 
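
The relation between the two ensembles can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it samples networks from G(N, p) and from G(N, M) with p = M/(N(N-1)/2) and compares the number of links. The function names, the network size and the number of samples are illustrative assumptions.

```python
import itertools
import random


def sample_gnp(n, p, rng):
    """One G(N, p) sample: every pair of nodes is linked independently with probability p."""
    return [pair for pair in itertools.combinations(range(n), 2) if rng.random() < p]


def sample_gnm(n, m, rng):
    """One G(N, M) sample: M distinct pairs of nodes chosen uniformly at random."""
    pairs = list(itertools.combinations(range(n), 2))
    return rng.sample(pairs, m)


rng = random.Random(0)          # fixed seed so that the sketch is reproducible
n, m = 200, 300                 # illustrative sizes
p = m / (n * (n - 1) / 2)       # link probability conjugated to M

# In G(N, M) the number of links is fixed; in G(N, p) it fluctuates around M.
gnm_links = len(sample_gnm(n, m, rng))
gnp_link_counts = [len(sample_gnp(n, p, rng)) for _ in range(200)]
gnp_mean_links = sum(gnp_link_counts) / len(gnp_link_counts)
print(gnm_links, gnp_mean_links)
```

In this sketch the hard constraint of G(N, M) is satisfied by every sample, while in G(N, p) it is satisfied only on average, which is the distinction carried over to the generalized ensembles.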

2.2. Generalized random network ensembles 

Recently a statistical mechanics approach has been proposed [17, 18] that 
generalizes the random network ensembles to network ensembles with much more 
complex structure, including networks with highly heterogeneous degree sequences and 
non-trivial community structure or spatial dependence of the link probability. The 
statistical mechanics approach is able to describe both "microcanonical" network 
ensembles (that satisfy hard structural constraints and generalize the G(N, M) random 
ensembles) and "canonical" network ensembles (that satisfy the structural constraints 
when their properties are averaged over the whole ensemble and generalize the G(N, p) 
ensemble). 

• The "microcanonical" networks have to satisfy a series of hard constraints $\vec{F}(\mathcal{G}) = \vec{C}$ 
and the probability of these networks is given by 

$$P_{mc}(\mathcal{G}) = \frac{1}{Z}\,\delta\big(\vec{F}(\mathcal{G}) - \vec{C}\big) \qquad (4)$$

with Z indicating the cardinality of the ensemble. The probability of each link $p_{ij}$ 
is computed by introducing Lagrange multipliers [17, 18]. 

• The "canonical" conjugated ensemble can be built starting from the probability 
of the links $p_{ij}$ in the "microcanonical" one. We assign to each network $\mathcal{G}$ the 
probability 

$$P_c(\mathcal{G}) = \prod_{i<j} p_{ij}^{a_{ij}} (1 - p_{ij})^{1 - a_{ij}} \qquad (5)$$

which generalizes (3) to heterogeneous networks. In the "canonical" ensembles the 
structural constraints are satisfied on average 

$$\overline{\vec{F}(\mathcal{G})} = \vec{C}. \qquad (6)$$

Here and in the following we always indicate by $\overline{\,\cdot\,}$ the average over the ensemble 
probability $P_c(\mathcal{G})$ given by (5) and with $\langle \cdot \rangle$ the average over all the nodes 
i = 1, ..., N. 

In this paper we focus on generalized "canonical" networks. Each node i in this ensemble 
is characterized by two discrete hidden variables $\theta_i$ and $\alpha_i$. We consider in this paper 
the link probability given by 

$$p_{ij} = \frac{\theta_i \theta_j W(\alpha_i, \alpha_j)}{1 + \theta_i \theta_j W(\alpha_i, \alpha_j)} \qquad (7)$$

which is fully specified once the function $W(\alpha, \alpha')$ is given. The link probability (7) 
corresponds to maximally entropic ensembles with given degree structural constraints. 

In the ensembles described by (7), the degree $k_i$ of each node is a Poisson variable 
with average 

$$\overline{k_i} = \sum_{j \neq i} p_{ij}. \qquad (8)$$

In the following we specifically comment on some relevant limiting cases for the 
general distribution (7). 

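• The G(N, p) ensemble 

If the values of the hidden variables $\theta$ are all equal, i.e. $\theta_i = \theta\ \forall i$, and $W(\alpha, \alpha') = 
\delta_{\alpha,\alpha'}$, the probability of a link is given by 

$$p_{ij} = p = \frac{\theta^2}{1 + \theta^2}. \qquad (9)$$

The degree of each node is a Poisson variable with equal average $\overline{k} = pN$. 
Performing also the average $\langle \cdot \rangle$ over all the nodes of the network we get 

$$\langle k \rangle = \overline{k}. \qquad (10)$$

We recover therefore the Erdős and Rényi ensemble G(N, p) by taking 

$$\theta = \sqrt{\frac{\langle k \rangle / N}{1 - \langle k \rangle / N}} \simeq \sqrt{\frac{\langle k \rangle}{N}} \qquad (11)$$

where the last expression is valid for sparse networks with $\langle k \rangle$ finite. 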

• The Configuration model 

If the linking probability $p_{ij}$ of equation (7) depends only on $\theta_i$ and $\theta_j$ (i.e. 
$W(\alpha, \alpha') = 1\ \forall \alpha, \alpha'$), then 

$$p_{ij} = \frac{\theta_i \theta_j}{1 + \theta_i \theta_j}. \qquad (12)$$

This ensemble is the canonical version of the Configuration model, each node i 
having a degree $k_i$ distributed according to a Poisson variable with average 

$$\overline{k_i} = \sum_{j \neq i} \frac{\theta_i \theta_j}{1 + \theta_i \theta_j}. \qquad (13)$$

This ensemble has in general non-trivial degree-degree correlations that disappear 
for $\max_i(\theta_i) \ll 1$. In this last case the linking probability $p_{ij}$ defined 
in equation (12) can be approximated as 

$$p_{ij} \simeq \theta_i \theta_j. \qquad (14)$$

Therefore in this limit the networks of the ensemble are uncorrelated and there is 
a simple relation between the hidden variable $\theta_i$ and the average degree $\overline{k_i}$ of 
node i, i.e. 

$$\theta_i = \frac{\overline{k_i}}{\sqrt{\langle k \rangle N}}. \qquad (15)$$

Finally we observe that if we use (15) the linking probability $p_{ij}$ can be expressed 
in the well-known form for uncorrelated networks 

$$p_{ij} = \frac{\overline{k_i}\,\overline{k_j}}{\langle k \rangle N}. \qquad (16)$$

• Structured networks 

In the more general structured case we have two possibilities: 

- i) The index $\alpha = 1, \ldots, A$ with $A = O(N^{1/2})$ can indicate the community of 
a node and the function $W(\alpha, \alpha')$ can be an A × A matrix. In this case the 
number of links $L(\alpha, \alpha')$ between the community $\alpha$ and the community $\alpha'$ will 
be distributed according to a Poisson distribution with average 

$$\overline{L(\alpha, \alpha')} = \sum_{i<j} p_{ij}\, \delta(\min(\alpha_i, \alpha_j), \alpha)\, \delta(\max(\alpha_i, \alpha_j), \alpha'). \qquad (17)$$

- ii) The index $\alpha$ can indicate a position in a metric space which determines the 
link probability. In this case the function $W(\alpha, \alpha')$ is a vector depending only 
on the metric distance, i.e. $W = W(d[\alpha, \alpha'])$. 

For structured networks with a generic distribution of the $\theta$'s and a non-trivial function 
$W(\alpha, \alpha')$ we can consider the limit $[\max_i(\theta_i)]^2 [\max_{\alpha,\alpha'} W(\alpha, \alpha')] \ll 1$. 
In this limit the linking probability $p_{ij}$ given by equation (7) reduces to the simple 
form 

$$p_{ij} \simeq \theta_i \theta_j W(\alpha_i, \alpha_j) \qquad (18)$$

and we have 

$$\theta_i = \frac{\overline{k_i}}{\chi_{\alpha_i}} \qquad (19)$$

with $\chi_{\alpha} = \sum_j \theta_j W(\alpha, \alpha_j)$. 
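
To make the construction of the canonical ensemble concrete, the sketch below draws networks with the link probability of equation (7) and compares the sampled number of links with the value expected from the average degrees $\sum_{j \neq i} p_{ij}$. It is an illustration under stated assumptions only: the two-community matrix `W`, the values of the hidden variables and all function names are hypothetical choices, not taken from the original work.

```python
import itertools
import random


def link_probability(theta_i, theta_j, w):
    """Equation (7): p_ij = theta_i theta_j W / (1 + theta_i theta_j W)."""
    x = theta_i * theta_j * w
    return x / (1.0 + x)


def expected_degrees(theta, alpha, W):
    """Average degree of every node, i.e. the sum of p_ij over j != i."""
    n = len(theta)
    return [
        sum(link_probability(theta[i], theta[j], W[alpha[i]][alpha[j]])
            for j in range(n) if j != i)
        for i in range(n)
    ]


def sample_network(theta, alpha, W, rng):
    """One network of the canonical ensemble: each pair is linked independently."""
    n = len(theta)
    return [
        (i, j) for i, j in itertools.combinations(range(n), 2)
        if rng.random() < link_probability(theta[i], theta[j], W[alpha[i]][alpha[j]])
    ]


# Hypothetical example: two communities of 50 nodes with different hidden variables.
theta = [0.1] * 50 + [0.2] * 50
alpha = [0] * 50 + [1] * 50
W = [[1.0, 0.1], [0.1, 1.0]]

rng = random.Random(0)
expected_links = sum(expected_degrees(theta, alpha, W)) / 2.0
sampled_links = [len(sample_network(theta, alpha, W, rng)) for _ in range(50)]
mean_sampled_links = sum(sampled_links) / len(sampled_links)
print(expected_links, mean_sampled_links)
```

Because the links are independent, the structural constraints are met only on average over the ensemble, so single samples fluctuate around the expected value.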



3. Large deviation of the cluster distribution 

The number of connected components $C(\mathcal{G})$, or "clusters", of a network $\mathcal{G}$ gives direct 
information on the topological structure of the network and its percolation properties. 
Indeed if $C(\mathcal{G})/N$ is small there are few large connected components, while in the opposite 
case the network is divided into a huge number of small clusters. In the limit of large 
network sizes $N \to \infty$ each canonical generalized network ensemble will be characterized 
by a typical value of the number of clusters $C^*(N)$. The typical distribution of clusters 
gives the percolation properties of the networks belonging to the ensemble and 
characterizes the critical exponents of the percolation phase transition. Moreover 
different network realizations $\mathcal{G}$ of a generalized canonical ensemble will have a number 
of clusters $C(\mathcal{G})$ which is subject to large deviations with respect to the typical value 
$C^*(N)$. 

Given the probability of a network $P_c(\mathcal{G})$ in the canonical generalized random 
ensembles, as defined in equation (5), we can define the probability $P(C)$ of 
generating a random network $\mathcal{G}$ in this ensemble with C clusters as follows: 

$$P(C) = \sum_{\mathcal{G}} P_c(\mathcal{G})\, \delta(C, C(\mathcal{G})). \qquad (20)$$

In the thermodynamic limit, $N \to \infty$, the probability $P(C)$ is centered at some typical 
value $C^*$ and decays extremely fast away from $C^*$. Let us 
indicate with c = C/N the number of connected components per vertex; the typical 
value of this quantity converges in the thermodynamic limit to a size-independent value 
$c^*$. Therefore, in order to characterize $P(C)$ in the thermodynamic limit, we consider 
the function $\omega(c)$ defined as 

$$\omega(c) = \lim_{N \to \infty} \frac{1}{N} \ln P(C), \qquad (21)$$

implying clearly $\omega(c) \le 0$ for all $c \in [0, 1]$ and $\omega(c^*) = 0$. 

Finally we introduce the generating function $Y(q)$ of the cluster probability $P(C)$ 

$$Y(q) = \sum_{C} P(C)\, q^{C} = \sum_{\mathcal{G}} \prod_{i<j} p_{ij}^{a_{ij}} (1 - p_{ij})^{1 - a_{ij}}\, q^{C(\mathcal{G})}, \qquad (22)$$

where in the last expression we have used equation (5) defining the generalized random 
ensembles. We characterize the asymptotic limit of the cluster generating function $Y(q)$ 
by the function $\phi(q)$ defined as 

$$\phi(q) = \lim_{N \to \infty} \frac{1}{N} \ln Y(q). \qquad (23)$$

From equation (22) we obtain, with a saddle point calculation, that $\omega(c)$ 
is the Legendre transform of $\phi(q)$, according to 
the relation 

$$\omega(c) = \min_{q} \left[ \phi(q) - c \log q \right]. \qquad (24)$$

The cluster distribution is therefore fully characterized in the asymptotic limit if we 
know the function $\phi(q)$. 
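
The quantity at the centre of this section, the number of clusters per node c = C/N, is easy to measure on sampled networks. The sketch below is an illustration only: it counts connected components with a union-find structure and estimates c for G(N, p) networks with mean degree below the percolation threshold. The helper names, the system size and the sample count are assumptions made for this example.

```python
import itertools
import random


def count_components(n, edges):
    """Number of connected components C(G), computed with union-find (path halving)."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = n
    for i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:        # the link merges two distinct clusters
            parent[root_i] = root_j
            components -= 1
    return components


def components_per_node(n, mean_degree, samples, rng):
    """Sampled values of c = C/N for G(N, p) networks with p = mean_degree / N."""
    p = mean_degree / n
    values = []
    for _ in range(samples):
        edges = [pair for pair in itertools.combinations(range(n), 2)
                 if rng.random() < p]
        values.append(count_components(n, edges) / n)
    return values


rng = random.Random(1)
c_values = components_per_node(n=400, mean_degree=0.5, samples=20, rng=rng)
c_mean = sum(c_values) / len(c_values)
print(c_mean, min(c_values), max(c_values))
```

The spread of `c_values` around `c_mean` gives a rough picture of the typical fluctuations only; the large deviations described by $\omega(c)$ are exponentially rare and are not accessible by direct sampling of this kind.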

4. The fully connected heterogeneous Potts Model and the Percolation 
transition of the generalized random networks ensembles 

In this section we reduce the problem of finding the cluster distribution in 
generalized canonical random ensembles to the study of a mean-field Potts model with 
heterogeneous couplings. We will show that $\phi(q)$, given by (23), has a formal relation 
with the free energy of the mean-field Potts model with heterogeneous couplings, after a 
suitable analytic continuation. This relation generalizes the known connection between 
the fully connected Potts model and the generating function of the cluster distribution 
of a random G{N,p) network [lH [13]. 

In order to present the results of the paper in a self-contained way we describe 
here the cluster expansion of the fully connected Potts model. The Potts model is a 
well-known statistical mechanics problem [25] describing N classical degrees of freedom 
$\sigma_i$ associated with the nodes i = 1, ..., N of a given network. Each variable $\sigma_i$ can take 
q different values, namely $\sigma_i = 0, \ldots, q-1$, and is coupled to all the other degrees of 
freedom $\sigma_j$ by means of a two-body interaction of strength $J_{ij}$. This interaction favors 
configurations where all the nodes in the network have the same value of $\sigma$. Thus the 
energy reads 

$$E[\{\sigma\}] = -\sum_{i<j} J_{ij}\, \delta(\sigma_i, \sigma_j) - h \sum_{\sigma} u_{\sigma} \sum_i \delta(\sigma, \sigma_i), \qquad (25)$$

where we assume that all the couplings are positive, $J_{ij} > 0$, and that the first sum in 
(25) runs over all the pairs of nodes of a fully connected network. Moreover, we take 
the auxiliary field $h u_{\sigma}$ parallel to the direction $\sigma$. The partition function of the model 
is 

$$Z = \sum_{\{\sigma_i = 0, \ldots, q-1\}} \exp(-\beta E[\{\sigma\}]), \qquad (26)$$

where $\beta$ is the inverse temperature and the summation runs over all spin 
configurations. In order to map the Potts model to the cluster structure of the 
generalized random network ensembles, we expand the partition function Z following 
the article [H] 

$$Z[\{v_{ij}\}, h] = \sum_{\{\sigma\}} \prod_{i<j} \left[ 1 + v_{ij}\, \delta(\sigma_i, \sigma_j) \right] e^{\beta h \sum_{\sigma} u_{\sigma} \sum_i \delta(\sigma, \sigma_i)}, \qquad (27)$$

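where we have defined 

$$v_{ij} = e^{\beta J_{ij}} - 1. \qquad (28)$$

Expanding equation (27) we obtain 

$$Z[\{v_{ij}\}, h] = \sum_{\{\sigma\}} e^{\beta h \sum_{\sigma} u_{\sigma} \sum_i \delta(\sigma, \sigma_i)} \Big[ 1 + \sum_{i<j} v_{ij}\, \delta(\sigma_i, \sigma_j) + \sum_{i<j,\ k<l,\ (ij) \neq (kl)} v_{ij} v_{kl}\, \delta(\sigma_i, \sigma_j)\, \delta(\sigma_k, \sigma_l) + \ldots \Big]. \qquad (29)$$

Each term in the expansion (29) corresponds to a possible network $\mathcal{G}$ formed by a 
subset $E(\mathcal{G})$ of edges of the complete network. Each contribution from a network $\mathcal{G}$ 
is weighted by the factor $\prod_{ij \in E(\mathcal{G})} v_{ij}$ and the sum is made over all possible networks 
$\mathcal{G}$ of N nodes. Using this expansion, after performing the sum over the configurations 
$\{\sigma_i = 0, \ldots, q-1\}$, we can write the partition function reported in (27) in the form 

$$Z[\{v_{ij}\}, h] = \sum_{\mathcal{G}} \prod_{ij \in E(\mathcal{G})} v_{ij} \prod_{n=0}^{C(\mathcal{G})-1} \Big( \sum_{\sigma} e^{\beta h u_{\sigma} S_n} \Big) \qquad (30)$$

with $E(\mathcal{G})$ given by the set of all edges in $\mathcal{G}$, $C(\mathcal{G})$ given by the number of connected 
components in the network, and $S_n$ denoting the size of the n-th component. From the 
previous equation it follows that in the absence of an external field 

$$Z[\{v_{ij}\}, h = 0] = \sum_{\mathcal{G}} \prod_{ij \in E(\mathcal{G})} v_{ij}\; q^{C(\mathcal{G})}. \qquad (31)$$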

By comparing the definition of the cluster generating function (22) and the expression 
(31) for the partition function of the Potts model, we observe that the two functions 
are related by the following simple expression: 

$$Y(q) = \prod_{i<j} (1 - p_{ij})\; Z[\{v_{ij} = p_{ij}(1 - p_{ij})^{-1}\}, h = 0], \qquad (32)$$

and the associated logarithmic function reads 

$$\phi(q) = \lim_{N \to \infty} \frac{1}{N} \sum_{i<j} \ln(1 - p_{ij}) - \beta f[\{v_{ij}\}], \qquad (33)$$

where $v_{ij} = p_{ij}(1 - p_{ij})^{-1}$ and $\beta f = -\lim_{N \to \infty} \frac{1}{N} \ln Z$ is the free energy density, defined at null external field h = 0. In the high 
temperature limit $\beta \to 0$ the couplings $J_{ij}$ are linked to the edge probability 
by means of equation (28), so that 

$$\beta J_{ij} \simeq v_{ij} = \frac{p_{ij}}{1 - p_{ij}}. \qquad (34)$$

Therefore in order to find the cluster generating function we can simply solve the fully 
connected Potts model with heterogeneous couplings. Any assumption on the network 
ensemble will have a direct counterpart in the structure of the couplings in the Potts 
model. 
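
The relation (32) between the cluster generating function and the Potts partition function at h = 0 can be checked by brute force on a very small network. The sketch below is only a numerical sanity check under illustrative assumptions: the four values of the hidden variables, the choice W = 1 and the function names are hypothetical. It enumerates all spin configurations on one side and all networks on the other.

```python
import itertools
import math


def count_components(n, edges):
    """Number of connected components of a graph on n nodes (union-find)."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = n
    for i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            components -= 1
    return components


def potts_partition_function(n, v, q):
    """Z at h = 0: sum over spin configurations of prod_{i<j} [1 + v_ij delta(s_i, s_j)]."""
    total = 0.0
    for sigma in itertools.product(range(q), repeat=n):
        weight = 1.0
        for (i, j), v_ij in v.items():
            if sigma[i] == sigma[j]:
                weight *= 1.0 + v_ij
        total += weight
    return total


def cluster_generating_function(n, p, q):
    """Y(q): sum over all networks G of P_c(G) q^{C(G)}, with independent links p_ij."""
    pairs = list(p)
    total = 0.0
    for present in itertools.product((0, 1), repeat=len(pairs)):
        probability = 1.0
        edges = []
        for pair, a_ij in zip(pairs, present):
            probability *= p[pair] if a_ij else 1.0 - p[pair]
            if a_ij:
                edges.append(pair)
        total += probability * q ** count_components(n, edges)
    return total


# Hypothetical example with heterogeneous hidden variables and W = 1.
theta = [0.5, 1.0, 1.5, 2.0]
n, q = len(theta), 3
p = {(i, j): theta[i] * theta[j] / (1.0 + theta[i] * theta[j])
     for i, j in itertools.combinations(range(n), 2)}
v = {pair: p_ij / (1.0 - p_ij) for pair, p_ij in p.items()}

Y = cluster_generating_function(n, p, q)
Z = potts_partition_function(n, v, q)
prefactor = math.prod(1.0 - p_ij for p_ij in p.values())
print(Y, prefactor * Z)
```

The enumeration grows exponentially with the number of nodes, so this check is only meaningful for a handful of nodes; it is meant to make the identity tangible, not to replace the saddle-point treatment.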

We will solve the model in this framework, specializing the results to the cases of 
interest (7). Using equation (34) we obtain 

$$v_{ij} = \beta J(\theta_i, \theta_j, \alpha_i, \alpha_j) = \theta_i \theta_j W(\alpha_i, \alpha_j). \qquad (35)$$

In the different cases under study the function $J(\theta_i, \theta_j, \alpha_i, \alpha_j)$ takes different 
values: 

• The G(N, p) ensemble 

For the characterization of the cluster distribution of a Poisson random network in 
the G(N, p) ensemble with $p = \langle k \rangle / N$ we take 

$$\beta J(\theta_i, \theta_j, \alpha_i, \alpha_j) \simeq \frac{\langle k \rangle}{N} \qquad (36)$$

for all pairs i, j. 

• The Configuration model 

For the characterization of the cluster distribution in the Configuration model we 
take 

$$\beta J(\theta_i, \theta_j, \alpha_i, \alpha_j) = \theta_i \theta_j. \qquad (37)$$

In the case of an uncorrelated network we have $\max_i(\theta_i) \ll 1$ and we can express the 
hidden variables $\theta_i$ in terms of the expected average degree $\overline{k_i}$, as $\theta_i = \overline{k_i}/\sqrt{\langle k \rangle N}$. 
Consequently the couplings of the Potts model take the form 

$$\beta J(\theta_i, \theta_j, \alpha_i, \alpha_j) = \frac{\overline{k_i}\, \overline{k_j}}{\langle k \rangle N}. \qquad (38)$$

• Structured networks 

For the characterization of the cluster distribution in structured network ensembles 
with community structure or spatial dependence on the embedding geometric space, 
we have 

$$\beta J(\theta_i, \theta_j, \alpha_i, \alpha_j) = \theta_i \theta_j W(\alpha_i, \alpha_j). \qquad (39)$$

In the case in which $[\max_i(\theta_i)]^2 [\max_{\alpha,\alpha'} W(\alpha, \alpha')] \ll 1$ the previous equation 
simplifies to 

$$\beta J(\theta_i, \theta_j, \alpha_i, \alpha_j) = \overline{k_i}\, \overline{k_j}\, \frac{W(\alpha_i, \alpha_j)}{\chi_{\alpha_i} \chi_{\alpha_j}}. \qquad (40)$$

For $q \to 1$ the properties of the partition function (30) are in correspondence 
with the percolation properties [H] of the generalized canonical network ensembles 
with linking probabilities $p_{ij}$ given by (7). We will sketch the proof following [26]. 
It is straightforward that in the limit $q \to 1$ the partition function is $Z[\{v_{ij}\}, h] = 
\prod_{i<j} (1 - p_{ij})^{-1}$, so that 



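$$\lim_{N \to \infty} \lim_{q \to 1} \frac{\ln Z[\{v_{ij}\}, h] + \sum_{i<j} \ln(1 - p_{ij})}{N(q - 1)} = \lim_{N \to \infty} \frac{1}{N} \left. \frac{\partial \ln Z[\{v_{ij}\}, h]}{\partial q} \right|_{q=1}. \qquad (41)$$

We can choose the parameter $u_{\sigma} = \delta(\sigma, 0)$, so that the external field favors the $\sigma = 0$ 
state; the partition function reported in (30) then simplifies to 

$$Z[\{v_{ij}\}, h] = \sum_{\mathcal{G}} \prod_{ij \in E(\mathcal{G})} v_{ij} \prod_{n=0}^{C(\mathcal{G})-1} \left( q - 1 + e^{\beta h S_n} \right). \qquad (42)$$

Using the fact that $\sum_n f(S_n) = \sum_S \sum_n \delta(S_n - S) f(S) = \sum_S C(S) f(S)$, where S is the 
number of nodes in the same cluster and $C(S)$ the number of clusters with S nodes, 
the previous equation becomes 

$$Z[\{v_{ij}\}, h] = \sum_{\mathcal{G}} \prod_{ij \in E(\mathcal{G})} v_{ij}\; e^{\sum_S C(S) \ln\left( q - 1 + e^{\beta h S} \right)}. \qquad (43)$$

Performing the summation over the graphs with a saddle point approximation, we obtain 
that in the thermodynamic limit equation (43) becomes 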

^1 s{g) s 

where s = S/N and c(s) = C(S)/N. Differentiating the previous equation with respect 
to the external field we obtain that the probability that a node belongs to the percolating cluster 
is linked to the free energy function of the Potts model in the limit $q \to 1$ 

^hm 1 + 1^ = 1 - E^c(.) = P(te,}) . (45) 

The second derivative gives the mean number of clusters per node. Using the Potts model, 
we are also able to compute the probability that two given nodes belong to the percolating 
component. Let us introduce the node-node correlation in the limit $h \to 0$ 

A,(g) = Ee""''"'^"^^.<^. = (^^.^.)' (46) 

that measures the probability that two nodes have the same color. We can easily compute 
this quantity and we obtain 
Q 

q 

where $C_{ij}$ is the indicator function: it takes the value one if nodes i and j are in the 
same cluster, and vanishes otherwise. We want to underline the fact that the probability 
that two nodes are in the same non-percolating component is defined through the following 
relation 

n^, = {C.,)-P'i{pkl}) (48) 

This shows how solving the Potts model in the $q \to 1$ limit gives us information on the 
percolation transition in generalized network ensembles. 



lim-^A,(g) = l-(C.,) (47) 
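
5. Free energy of the Potts model and the percolation phase transition 

In order to solve the mean-field Potts model we introduce the order parameters 

$$c_{\theta\alpha}(\sigma) = \frac{1}{N_{\theta\alpha}} \sum_i \delta(\theta, \theta_i)\, \delta(\alpha, \alpha_i)\, \delta(\sigma, \sigma_i), \qquad (49)$$

where 

$$N_{\theta\alpha} = \sum_i \delta(\theta, \theta_i)\, \delta(\alpha, \alpha_i) \qquad (50)$$

is the number of nodes with given hidden variables $\theta$ and $\alpha$. The order parameters 
$c_{\theta\alpha}(\sigma)$ satisfy the normalization 

$$\sum_{\sigma} c_{\theta\alpha}(\sigma) = 1. \qquad (51)$$

The energy of the Potts model in the absence of an external field, h = 0, expressed in terms of 
the order parameters $c_{\theta\alpha}(\sigma)$, takes the form 

$$E[\{c_{\theta\alpha}(\sigma)\}] = -\frac{N^2}{2} \sum_{\sigma,\theta,\theta',\alpha,\alpha'} p_{\theta\alpha}\, p_{\theta'\alpha'}\, c_{\theta\alpha}(\sigma)\, c_{\theta'\alpha'}(\sigma)\, J(\theta, \theta', \alpha, \alpha') + O(N), \qquad (52)$$

where $p_{\theta\alpha} = N_{\theta\alpha}/N$ is the density of nodes with hidden variables $\theta$ and $\alpha$, and 
where we have explicitly shown the dependence of the coupling on the external parameters 
$\theta$ and $\alpha$. In order to express the partition function in terms of the collective variables 
$c_{\theta\alpha}(\sigma)$, we need to take into account the entropic contribution, counting the number of 
microscopic configurations with a given value of $c_{\theta\alpha}(\sigma)$. To leading order in N we 
get 

$$Z = \int \prod_{\theta,\alpha,\sigma} dc_{\theta\alpha}(\sigma)\; e^{-N \beta f[\{c_{\theta\alpha}(\sigma)\}]}, \qquad (53)$$

where the free energy density functional reads 

$$\beta f[\{c_{\theta\alpha}(\sigma)\}] = -\frac{N}{2} \sum_{\sigma,\theta,\theta',\alpha,\alpha'} p_{\theta\alpha}\, p_{\theta'\alpha'}\, c_{\theta\alpha}(\sigma)\, c_{\theta'\alpha'}(\sigma)\, \beta J(\theta, \theta', \alpha, \alpha') + \sum_{\sigma,\theta,\alpha} p_{\theta\alpha}\, c_{\theta\alpha}(\sigma) \ln c_{\theta\alpha}(\sigma). \qquad (54)$$

In the large N limit one can evaluate (53) by the saddle-point method. 
As a function of q, the Potts model undergoes a phase transition. For $q < q_c$ the 
order parameter is invariant under the permutation of the spin values $\sigma = 0, \ldots, q-1$. 
Above the percolation transition, for $q > q_c$, the ground state instead breaks the 
symmetry of the Hamiltonian. 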

5.1. Symmetric saddle point 

The free energy of the Potts model is invariant under the permutation of the q colors. 
When this symmetry is also shared by the ground state, the fraction of nodes of a given 
color can be written as 

$$c_{\theta\alpha}(\sigma) = \frac{1}{q}, \qquad (55)$$

which makes all colors equivalent. Inserting this ansatz in equation (54) we 
get 

$$\beta f^{sym}(q) = -\frac{N}{2q}\, \beta \sum_{\theta,\theta',\alpha,\alpha'} p_{\theta\alpha}\, p_{\theta'\alpha'}\, J(\theta, \theta', \alpha, \alpha') - \ln q. \qquad (56)$$

Computing the second order derivative of the free energy density functional, we can 
study the stability of the symmetric solution. When an eigenvalue of the Hessian 
matrix of the free energy (54) changes sign and becomes negative, the ansatz (55) is no 
longer correct. The Hessian matrix reads 

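$$\mathcal{H}_{\theta\theta'\alpha\alpha'}(\sigma, \tau) = \frac{\partial^2 (\beta f)}{\partial c_{\theta\alpha}(\sigma)\, \partial c_{\theta'\alpha'}(\tau)} = \delta(\sigma, \tau) \left[ \delta(\theta, \theta')\, \delta(\alpha, \alpha')\, p_{\theta\alpha}\, q - p_{\theta\alpha}\, p_{\theta'\alpha'}\, N \beta J(\theta, \theta', \alpha, \alpha') \right] \qquad (57)$$

and the related eigenvalue problem is 

$$(\lambda_{\theta\alpha} - p_{\theta\alpha}\, q)\, e_{\theta\alpha} = -p_{\theta\alpha}\, M_{\theta\alpha}, \qquad (58)$$

where the quantity $M_{\theta\alpha}$ is defined as 

$$M_{\theta\alpha} = \sum_{\theta'\alpha'} p_{\theta'\alpha'}\, N \beta J(\theta, \theta', \alpha, \alpha')\, e_{\theta'\alpha'}. \qquad (59)$$

Inserting equation (58) into (59), we find 

$$M_{\theta\alpha} = -\sum_{\theta',\alpha'} \frac{p_{\theta'\alpha'}^2\, N \beta J(\theta, \theta', \alpha, \alpha')}{\lambda_{\theta'\alpha'} - p_{\theta'\alpha'}\, q}\, M_{\theta'\alpha'}, \qquad (60)$$

defining the eigenvalues of the Hessian matrix in (57). In order to obtain the critical 
values of the external parameters that cause the instability of the free energy density, we 
have to find when the eigenvalues change sign. Upon imposing $\lambda_{\theta\alpha} = 0$ we find that this condition 
is 

$$M_{\theta\alpha} = \sum_{\theta'\alpha'} \frac{1}{q}\, p_{\theta'\alpha'}\, N \beta J(\theta, \theta', \alpha, \alpha')\, M_{\theta'\alpha'}. \qquad (61)$$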
In the general case $\beta J(\theta, \theta', \alpha, \alpha') = \theta \theta' W(\alpha, \alpha')$, the stability condition can be expressed 
as 

$$q < q_c = \Lambda \qquad (62)$$

with $\Lambda$ indicating the maximal eigenvalue of the matrix 

$$K_{\alpha,\alpha'} = N \sum_{\theta} p_{\theta\alpha'}\, \theta^2\, W(\alpha, \alpha'). \qquad (63)$$

In the following we study in detail the critical point $q_c$ defined by (62) and (63) in a 
few relevant cases of the generalized network ensembles. 
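
As a numerical illustration of (62) and (63), the sketch below builds the matrix K from a list of node hidden variables and estimates its maximal eigenvalue $\Lambda$ by power iteration. This is a minimal sketch under stated assumptions: the function names and the example values are hypothetical, and the power iteration is assumed to converge, which holds for instance when K has strictly positive entries. Since $N p_{\theta\alpha'}$ counts the nodes with hidden variables $(\theta, \alpha')$, the sum over $\theta$ in (63) is written as a sum of $\theta_i^2$ over the nodes of community $\alpha'$.

```python
import math


def coupling_matrix(theta, alpha, W):
    """K[a][b] = W[a][b] * sum of theta_i^2 over the nodes i of community b, as in (63)."""
    communities = len(W)
    second_moment = [0.0] * communities
    for theta_i, alpha_i in zip(theta, alpha):
        second_moment[alpha_i] += theta_i * theta_i
    return [[W[a][b] * second_moment[b] for b in range(communities)]
            for a in range(communities)]


def largest_eigenvalue(K, iterations=1000):
    """Power iteration for the maximal eigenvalue of a non-negative matrix."""
    size = len(K)
    v = [1.0 / size] * size                 # positive start vector, entries sum to 1
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(K[a][b] * v[b] for b in range(size)) for a in range(size)]
        eigenvalue = sum(w)                 # equals sum(K v) / sum(v) since sum(v) == 1
        if eigenvalue == 0.0:
            return 0.0
        v = [x / eigenvalue for x in w]
    return eigenvalue


# Homogeneous case: theta_i = sqrt(<k>/N) and a single community.
n, mean_degree = 100, 2.5
theta_er = [math.sqrt(mean_degree / n)] * n
lambda_er = largest_eigenvalue(coupling_matrix(theta_er, [0] * n, [[1.0]]))

# Two equal communities, with illustrative intra- and inter-community weights.
theta_two = [math.sqrt(0.03)] * 100
alpha_two = [0] * 50 + [1] * 50
lambda_two = largest_eigenvalue(
    coupling_matrix(theta_two, alpha_two, [[1.0, 0.2], [0.2, 1.0]]))
print(lambda_er, lambda_two)
```

In the homogeneous example $\Lambda$ reduces to the mean degree, so that the percolation point q = 1 corresponds to $\langle k \rangle = 1$, in line with the G(N, p) case discussed next.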

• The G(N, p) ensemble 

In the special case of networks in the G(N, p) ensemble, with a delta-like 
distribution $p_{\theta} = \delta(\theta, \sqrt{\langle k \rangle / N})$, the critical point for percolation q = 1 provided by 
the expressions (62) and (63) is the well-known percolation condition for a random 
network, $\overline{k} = \langle k \rangle = 1$. 

• The Configuration model 

In the case of the Configuration model the couplings factorize, $\beta J(\theta, \theta', \alpha, \alpha') = \theta \theta'$. 
The stability condition (62), (63) becomes 

$$q_c = N \langle \theta^2 \rangle. \qquad (64)$$

In the case in which the network is uncorrelated we have $\theta_i = \overline{k_i}/\sqrt{\langle k \rangle N}$ and the 
degree $k_i$ of a node i is a Poisson variable with average $\overline{k_i}$. The critical point (62) 
can then be expressed in terms of the actual degree of the canonical Configuration 
ensemble as 

$$q_c = \frac{\langle k(k-1) \rangle}{\langle k \rangle}. \qquad (65)$$

In the typical case limit, i.e. q = 1, the previous equation corresponds to the 
condition for the percolation transition in Configuration networks [U [11], [15]. 

• Structured networks 

In the general case of structured networks the complete eigenvalue problem in 
equations (62) and (63) has to be solved on a case-by-case basis in order 
to find the percolation critical point. 

Nevertheless in the following we present two simple cases in which the problem 
can be simplified. 

- First case 

We present a case in which a perturbative analysis gives a good approximation 
to the critical point. The case under study is the one in which the network 
has a detailed structure made of A different communities labeled with an index 
$\alpha = 1, \ldots, A$ and $A \sim O(1)$. Each community has well-defined features such 
as the average degree and the number of links shared with other communities. 
This naturally leads to an interaction between nodes which depends on the 
community they belong to, encoded in the following matrix 

$$W(\alpha, \alpha') = \begin{cases} \psi & \text{if } \alpha = \alpha' \\ \xi & \text{if } \alpha \neq \alpha' \end{cases} \qquad (66)$$

Under this hypothesis the matrix K of (63) takes the form 

$$K_{\alpha,\alpha'} = N\, W(\alpha, \alpha') \sum_{\theta} p_{\theta\alpha'}\, \theta^2 = N\, W(\alpha, \alpha')\, \langle \theta^2 \rangle_{\alpha'}, \qquad (67)$$

where we indicated with $\langle x \rangle_{\beta} = \sum_{\theta} p_{\theta\beta}\, x$ the average over one single component 
$\beta$. The eigenvalue problem (62) that we have to solve to find the critical point 
of the Potts model can be solved perturbatively in the limit $\Delta = \xi/\psi \ll 1$. 
In this case the matrix K is 

$$K = \psi\, (D + \Delta H) \qquad (68)$$

where D is a diagonal matrix and H has vanishing diagonal elements 

$$D_{\alpha\alpha'} = N \langle \theta^2 \rangle_{\alpha}\, \delta(\alpha, \alpha'), \qquad H_{\alpha\alpha'} = \begin{cases} 0 & \text{if } \alpha = \alpha' \\ N \langle \theta^2 \rangle_{\alpha'} & \text{if } \alpha \neq \alpha' \end{cases} \qquad (69)$$

It is well known from perturbation theory for non-degenerate states that the 
eigenvalues of this problem show second order corrections in the parameter $\Delta$ 
to the diagonal entries $D_{\alpha\alpha}$. Finally we obtain that the onset of instability 
occurs when the following relation is satisfied 



Qc = N max lb 



.(70) 



This set of coupled equations reduces to the value found in the Configuration 
model, i.e. $q_c = N \langle \theta^2 \rangle$, when there is only one single community. Here we 
report the condition for the leading term, of order $O(\Delta^0)$, which has the following form 

$$q_c = \psi N \max_{\alpha = 1, \ldots, A} \langle \theta^2 \rangle_{\alpha}. \qquad (71)$$

We want to underline that the new percolation condition becomes 

$$N \max_{\alpha = 1, \ldots, A} \langle \theta^2 \rangle_{\alpha} = \frac{1}{\psi}, \qquad (72)$$

meaning that the percolation transition depends strongly on the number of 
links of the most connected community. 

Whenever different communities have the same distribution, i.e. the same 
second moment $\langle \theta^2 \rangle_{\alpha} = \langle \theta^2 \rangle$, we are able to perform the calculation exactly 
and the critical value $q_c$ reads 

$$q_c = \left( \psi + (A - 1)\, \xi \right) N \langle \theta^2 \rangle. \qquad (73)$$

- Second case 

The second case that we consider is formed by sparse structured networks with 
the couplings $\beta J(\theta, \theta', \alpha, \alpha')$ taking the expression (40), which we rewrite here for 
convenience 

$$\beta J(\theta_i, \theta_j, \alpha_i, \alpha_j) = \overline{k_i}\, \overline{k_j}\, \frac{W(\alpha_i, \alpha_j)}{\chi_{\alpha_i} \chi_{\alpha_j}}. \qquad (74)$$

In the further approximation that the density of nodes with "hidden variables" 
$\theta$ and $\alpha$ is factorizable, i.e. $p_{\theta\alpha} = p_{\theta}\, p_{\alpha}$, we can simplify the eigenvalue problem 
(62), (63) and find the critical point of the Potts model as 

$$q_c = \langle k(k-1) \rangle\, \Lambda \qquad (75)$$

where $\Lambda$ is the maximal eigenvalue of the matrix K defined as 
5.2. Asymmetric saddle point 

Below the phase transition the symmetric solution (55) is no longer stable, as shown 
in the previous section. In the stationary state of the Potts model a giant component 
appears, and a more complicated saddle point has to be found. Since 
one single color becomes dominant, we generalize the ansatz made for the Potts 
model with homogeneous couplings [13] and propose the following ansatz for the parameter $c_{\theta\alpha}$ 

$$c_{\theta\alpha}(\sigma, s_{\theta\alpha}) = \begin{cases} \frac{1}{q} \left( 1 + (q - 1)\, s_{\theta\alpha} \right) & \text{if } \sigma = 0 \\ \frac{1}{q} \left( 1 - s_{\theta\alpha} \right) & \text{otherwise} \end{cases} \qquad (77)$$

Thus the free energy density functional reads 

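$$\beta f[\{c_{\theta\alpha}(\sigma, s_{\theta\alpha})\}] = \frac{N}{2q} \sum_{\theta,\theta',\alpha,\alpha'} p_{\theta\alpha}\, p_{\theta'\alpha'}\, \beta J(\theta, \theta', \alpha, \alpha') \left[ (1 - q)\, s_{\theta\alpha}\, s_{\theta'\alpha'} - 1 \right] - \log q + \sum_{\theta\alpha} \frac{p_{\theta\alpha}}{q} \left[ (q - 1)(1 - s_{\theta\alpha}) \log(1 - s_{\theta\alpha}) + (1 + (q - 1)\, s_{\theta\alpha}) \log(1 + (q - 1)\, s_{\theta\alpha}) \right], \qquad (78)$$

which we have to minimize over the variational parameters $s_{\theta\alpha}$. Solving the equation 

$$\frac{\partial\, \beta f[\{c_{\theta\alpha}(\sigma, s_{\theta\alpha})\}]}{\partial s_{\theta\alpha}} = 0 \qquad (79)$$

we finally obtain the self-consistent condition for the parameter $s_{\theta\alpha}$, which we solved numerically, 

$$s_{\theta\alpha} = \frac{e^{\theta \rho_{\alpha}} - 1}{q - 1 + e^{\theta \rho_{\alpha}}} \qquad (80)$$

with $\rho_{\alpha}$ given by 

$$\rho_{\alpha'} = N \sum_{\alpha\theta} \theta\, W(\alpha, \alpha')\, p_{\theta\alpha}\, s_{\theta\alpha}. \qquad (81)$$

Therefore equation (80) can be expressed as a closed expression for $\rho_{\alpha}$, which is the order 
parameter for the Potts phase transition. In particular we find 

$$\rho_{\alpha'} = N \sum_{\theta\alpha} p_{\theta\alpha}\, \theta\, W(\alpha, \alpha')\, \frac{e^{\theta \rho_{\alpha}} - 1}{q - 1 + e^{\theta \rho_{\alpha}}}. \qquad (82)$$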

The solution of this equation is $\rho_{\alpha} = 0$ for $q < q_c$ and develops a non-zero solution 
for $q > q_c$. The transition can be continuous or discontinuous. In all the ensembles 
studied in this paper, q = 2 marks the crossover between a second order phase transition 
and a first order one. This can be understood in the general framework of Landau 
theory. As is well known, the Hamiltonian of the Potts model is invariant under the 
permutation symmetry of the q colors, which in the case $q \le 2$ is accidentally equivalent 
to the $Z_2$ symmetry. From equation (78) it is easy to show that the free energy is explicitly even 
under the transformation $s_{\theta\alpha} \to -s_{\theta\alpha}$ when $q \le 2$, while for higher q the free energy 
density f contains all possible powers of the order parameter $s_{\theta\alpha}$. As a consequence, 
within Landau theory, this property of the free energy for $q \le 2$ necessarily results 
in a continuous phase transition, at least in the absence of an external field. Thus on 
general grounds we expect that the crossover from a second to a first order transition can occur 
only at the value q = 2, independently of the network we choose. If we expand (82) to 
first order in $\rho_{\alpha}$ we get the equation 

$$\rho_{\alpha} = \frac{1}{q} \sum_{\alpha'} K_{\alpha,\alpha'}\, \rho_{\alpha'} \qquad (83)$$

with the matrix $K_{\alpha,\alpha'}$ given by (63). We recover therefore the same critical point 
$q_c = \Lambda$, where $\Lambda$ is the maximal eigenvalue of the matrix K, as was found by studying 
the stability of the Potts model above the phase transition. 

For the case of the percolation transition, i.e. $q \to 1$ [H], we have a continuous 
phase transition and we can study equation (82) for small values of $\rho_{\alpha}$ to find the 
critical exponents of the percolation transition. 

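• The G(N, p) ensemble 

In this case the order parameter $\rho_{\alpha}$ is independent of $\alpha$, $\rho_{\alpha} = \rho$, and the 
self-consistent equation (82) simplifies to 

$$\rho = N \theta\, \frac{e^{\theta \rho} - 1}{q - 1 + e^{\theta \rho}} \qquad (84)$$

with $\theta = \sqrt{\langle k \rangle / N}$. The expansion for small values of $x = \theta \rho$ and q = 1 gives 

$$x = \langle k \rangle \left( x - \frac{x^2}{2} \right), \qquad (85)$$

therefore we can derive the known result that 

$$\rho \propto (\langle k \rangle - 1)^{\beta} \qquad (86)$$

with the mean-field critical exponent $\beta$ given by $\beta = 1$. 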
• The Configuration model

In the case of the Configuration model the order parameter $\rho_\alpha$ is independent of $\alpha$, i.e. $\rho_\alpha = \rho$, and the self-consistent equation (82) reduces to

$$\rho = N\sum_{\theta} p_\theta\,\theta\,\frac{e^{\theta\rho}-1}{q-1+e^{\theta\rho}} \qquad (87)$$

where $\rho = N\sum_\theta \theta\,p_\theta s_\theta$. The expansion of this equation for small values of $\rho$ provides the critical exponents for networks in the Configuration model and generalizes the results for uncorrelated networks to networks with the correlations imposed by the Configuration model. In the case in which $\langle\theta^3\rangle$ is finite, the expansion of (82) for $q = 1$ gives

$$\rho = N\sum_{\theta} p_\theta\,\theta^2\left(\rho - \frac{1}{2}\theta\rho^2\right) \qquad (88)$$

The order parameter close to the percolation phase transition goes like

$$\rho \propto \left(N\langle\theta^2\rangle - 1\right)^{\beta} \qquad (89)$$

with $\beta = 1$ as in the $G(N,p)$ ensemble. On the contrary, in the case in which $p_\theta \propto \theta^{-\gamma}$ with $\gamma \in (3,4]$ the expansion of equation (87) gives

$$\rho = N\langle\theta^2\rangle\rho - \rho^{\gamma-2} I \qquad (90)$$

giving the critical exponent $\beta = 1/(\gamma-3)$. Finally we study the case in which $p_\theta \propto \theta^{-\gamma}$ and $\gamma < 3$. In this case the self-consistent equation (87) can be written as

$$\rho = \rho^{\gamma-2} I \qquad (91)$$

giving the critical exponent $\beta = 1/(3-\gamma)$.

• The structured networks

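In general the study of the percolation transition for structured networks has to be carried out on a case-by-case basis, depending on the $p_{\theta\alpha}$ distribution and on the function $W(\alpha,\alpha')$ under consideration. Here we consider the case in which the distribution $p_{\theta\alpha}$ is factorisable, i.e. $p_{\theta\alpha} = p_\theta p_\alpha$. In this case, if the moment $\langle\theta^3\rangle$ is finite, expanding equation (82) for $q = 1$ we get

$$\rho_\alpha = N\sum_{\theta,\alpha'} p_{\theta\alpha'}\,\theta^2\,W(\alpha,\alpha')\left(\rho_{\alpha'} - \frac{1}{2}\theta\rho_{\alpha'}^2\right) = \sum_{\alpha'} K_{\alpha,\alpha'}\,\rho_{\alpha'} - \frac{N}{2}\sum_{\theta,\alpha'} p_{\theta\alpha'}\,W(\alpha,\alpha')\,\theta^3\rho_{\alpha'}^2 \qquad (92)$$

If we write $\rho_\alpha$ in terms of the eigenvectors $u^{\lambda}$ of the matrix $K$, i.e.

$$\rho_\alpha = \sum_{\lambda} c_\lambda\,u^{\lambda}_\alpha \qquad (93)$$

equation (92) can be written as an equation for the constants $c_\lambda$. Solving perturbatively, assuming that $c_\lambda \ll c_\Lambda$ for $\lambda \neq \Lambda$, with $\Lambda$ given by the maximal eigenvalue of the matrix $K$, we get

$$c_\Lambda \propto (\Lambda - 1), \qquad c_\lambda \propto (\Lambda - 1)^2 \ \ \text{for } \lambda \neq \Lambda \qquad (94)$$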

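Therefore in this case there are two critical exponents: $\beta = 1$ for the maximal eigenvector and $\beta' = 2$ for all the other eigenvectors. In the case $p_\theta \propto \theta^{-\gamma}$ with $\gamma \in (3,4]$ we have the expansion

$$\rho_\alpha = \sum_{\alpha'} K_{\alpha,\alpha'}\,\rho_{\alpha'} - \sum_{\alpha'} p_{\alpha'}\,W(\alpha,\alpha')\,I\,\rho_{\alpha'}^{\gamma-2} \qquad (95)$$

In this case we have

$$c_\Lambda \propto (\Lambda - 1)^{1/(\gamma-3)}, \qquad c_\lambda \propto (\Lambda - 1)^{2/(\gamma-3)} \ \ \text{for } \lambda \neq \Lambda \qquad (96)$$

and the critical exponents $\beta = 1/(\gamma-3)$ and $\beta' = 2/(\gamma-3)$, which generalize the results for scale-free networks. In the case in which $p_\theta \propto \theta^{-\gamma}$ and $\gamma < 3$ the expansion of equation (87) gives

$$\rho_\alpha = \sum_{\alpha'} \rho_{\alpha'}^{\gamma-2}\,W(\alpha,\alpha')\,I_{\alpha'} \qquad (97)$$

giving the critical exponents $\beta = 1/(3-\gamma)$ and $\beta' = (4-\gamma)/(3-\gamma)$.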
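The mean-field exponent $\beta = 1$ of the $G(N,p)$ case can be checked numerically. The sketch below assumes the standard $q = 1$ self-consistent equation of the random graph, $x = \langle k\rangle(1 - e^{-x})$ with $x = \theta\rho$, whose small-$x$ expansion is (85), and solves it by bisection just above the threshold $\langle k\rangle = 1$. The helper names `giant_component_x` and `estimate_beta` and the two values of $\langle k\rangle$ are illustrative assumptions.

```python
import math


def giant_component_x(mean_k, steps=200):
    """Non-zero root of x = <k> (1 - exp(-x)) for <k> > 1, by bisection.
    Assumes <k> - 1 is much larger than the lower bracket 1e-9."""
    if mean_k <= 1.0:
        return 0.0                                 # only the trivial solution exists

    def g(x):
        return x + mean_k * math.expm1(-x)         # x - <k>(1 - e^{-x})

    lo, hi = 1e-9, mean_k                          # g(lo) < 0 < g(hi)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_beta(k1=1.01, k2=1.02):
    """Slope of log x versus log(<k> - 1) between two points near threshold."""
    x1, x2 = giant_component_x(k1), giant_component_x(k2)
    return math.log(x2 / x1) / math.log((k2 - 1.0) / (k1 - 1.0))


beta_est = estimate_beta()   # close to 1, up to corrections of order <k> - 1
```

The estimate approaches 1 as the two sample points are moved closer to the threshold; the same two-point slope could be applied to numerical solutions of the heterogeneous equations to probe the other exponents quoted above.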

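Using equations (80) and (87) we find an explicit expression for the free energy of the Potts model in the asymmetric phase (78) as a function of the order parameter vector $\rho_\alpha$:

$$\beta f(q) = -\frac{1}{2q}\sum_{\theta\theta'\alpha\alpha'} p_{\theta\alpha}\,p_{\theta'\alpha'}\,N\theta\theta'\,W(\alpha,\alpha') - \frac{q-1}{2q}\sum_{\theta\alpha} p_{\theta\alpha}\,\theta\rho_\alpha\,\frac{e^{\theta\rho_\alpha}-1}{q-1+e^{\theta\rho_\alpha}} + \sum_{\theta\alpha} p_{\theta\alpha}\left[\frac{\theta\rho_\alpha\,e^{\theta\rho_\alpha}}{q-1+e^{\theta\rho_\alpha}} - \log\left(q-1+e^{\theta\rho_\alpha}\right)\right] \qquad (98)$$

Using this expression we find for the function $\phi(q)$ the explicit expression, as a function of $\rho_\alpha$,

$$\phi(q) = \frac{1-q}{2q}\left[\sum_{\theta\theta'\alpha\alpha'} p_{\theta\alpha}\,p_{\theta'\alpha'}\,N\theta\theta'\,W(\alpha,\alpha') - \sum_{\theta\alpha} p_{\theta\alpha}\,\theta\rho_\alpha\,\frac{e^{\theta\rho_\alpha}-1}{q-1+e^{\theta\rho_\alpha}}\right] - \sum_{\theta\alpha} p_{\theta\alpha}\left[\frac{\theta\rho_\alpha\,e^{\theta\rho_\alpha}}{q-1+e^{\theta\rho_\alpha}} - \log\left(q-1+e^{\theta\rho_\alpha}\right)\right] \qquad (99)$$

In the specific case of the Configuration model the preceding equations (98) and (99) can be simplified to the following equation for the free energy density $f(q)$

$$\beta f(q) = \frac{N\langle\theta\rangle^2}{2q}\left[(1-q)\frac{\rho^2}{N^2\langle\theta\rangle^2} - 1\right] + \sum_{\theta} p_\theta\left[\frac{\theta\rho\,e^{\theta\rho}}{q-1+e^{\theta\rho}} - \log\left(q-1+e^{\theta\rho}\right)\right] \qquad (100)$$

and the following equation for the function $\phi(q)$:

$$\phi(q) = \frac{N\langle\theta\rangle^2}{2q}\,(q-1)\left[\frac{\rho^2}{N^2\langle\theta\rangle^2} - 1\right] - \sum_{\theta} p_\theta\left[\frac{\theta\rho\,e^{\theta\rho}}{q-1+e^{\theta\rho}} - \log\left(q-1+e^{\theta\rho}\right)\right] \qquad (101)$$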



6. Cluster distribution 



The large deviations $\omega(c)$ of the cluster distribution can be calculated using (24), by performing a Legendre transformation of the function $\phi(q)$, calculated by evaluating expression (99) at the self-consistent solution of equation (82):

$$\omega(c) = \min_{q}\left(\phi(q) - c\log q\right) \qquad (102)$$

Therefore we have shown that by solving the heterogeneous Potts model with couplings $\beta J(\theta,\theta',\alpha,\alpha') = \theta\theta' W(\alpha,\alpha')$ we can directly characterize the critical point of the percolation phase transition, the critical exponent $\beta$ of this transition and the large deviation function of the number of clusters $c = C/N$ present in the networks of the ensemble.
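The Legendre transformation (102) is straightforward to carry out numerically once $\phi(q)$ is known. The sketch below is restricted, as an assumption, to the symmetric branch ($\rho = 0$) of the $G(N,p)$ ensemble, for which we take $\phi(q) = \langle k\rangle(1-q)/(2q) + \log q$. This form, the function names and the value $\langle k\rangle = 0.5$ are illustrative choices. The minimization is done over $u = \log q$ with a simple ternary search, and the symmetric branch is only meaningful for $q > q_c = \langle k\rangle$.

```python
import math


def phi_symmetric(q, mean_k):
    """Assumed symmetric-branch phi(q) for G(N,p), i.e. evaluated at rho = 0."""
    return mean_k * (1.0 - q) / (2.0 * q) + math.log(q)


def omega(c, mean_k, u_lo=-10.0, u_hi=10.0, iters=200):
    """omega(c) = min_q [phi(q) - c log q], minimised over u = log q
    by ternary search (the objective is convex in u for c < 1)."""
    def f(u):
        return phi_symmetric(math.exp(u), mean_k) - c * u

    for _ in range(iters):
        m1 = u_lo + (u_hi - u_lo) / 3.0
        m2 = u_hi - (u_hi - u_lo) / 3.0
        if f(m1) < f(m2):
            u_hi = m2
        else:
            u_lo = m1
    return f(0.5 * (u_lo + u_hi))


mean_k = 0.5
c_star = 1.0 - mean_k / 2.0                 # typical number of clusters per node
omega_at_typical = omega(c_star, mean_k)    # vanishes at the typical value
omega_off_typical = omega(0.6, mean_k)      # negative: exponentially rare
```

In this restricted setting the minimum sits at $q = \langle k\rangle/(2(1-c))$, so $\omega$ vanishes at $c^* = 1 - \langle k\rangle/2$, the familiar tree-like estimate for a random graph below the percolation threshold. Treating the asymmetric branch would require evaluating $\phi(q)$ at the non-zero solution of the self-consistent equation instead.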



7. Numerical results 



In this section we study the large deviations of the cluster distribution for different examples of generalized network ensembles.

• The $G(N,p)$ ensemble

We consider the simple case of the $G(N,p)$ network ensemble, where the average degree of each node is independent of $i$, i.e. $\theta_i = \sqrt{\langle k\rangle/N}$ and $p = \langle k\rangle/N$. Equation (84) is the self-consistent equation for the Potts model phase transition. This equation has already been studied in [13], where it was found that the Potts model has a phase transition at $q_c = \langle k\rangle$, of second order for $q_c < 2$ and of first order for $q_c > 2$. We refer the reader to references [H] and [13] for a full account of this case.

Figure 1. The two branches of the $\phi(q)$ function across the Potts model phase transition for the hidden variables ensembles with average degree distributed according to a Poissonian. The solid line indicates the function $\phi(q)$ calculated at the asymmetric solution of the Potts model and the dot-dashed line indicates the function $\phi(q)$ calculated for the symmetric solution. For $\kappa < 2$ the free energy at the phase transition varies continuously, from the $\rho = 0$ solution to the asymmetric solution. The inset shows the difference of the free energy calculated on the two branches. In the left figure $\kappa = 1.5$ while in the right $\kappa \simeq 2$.

Figure 2. The two branches of the $\phi(q)$ function across the Potts model phase transition for the Configuration model with the average of the degrees at each node distributed according to a Poissonian. The solid line indicates the function $\phi(q)$ calculated at the asymmetric solution of the Potts model and the dot-dashed line indicates the function $\phi(q)$ calculated for the symmetric solution. The free energy at the transition varies discontinuously at $q_c > \kappa = \langle k\rangle + 1$, when the metastable configuration $\rho \neq 0$ disappears. The inset shows explicitly the discontinuity in the difference between the function calculated on the two solutions.

• The Configuration model

In particular we study the limit of weak heterogeneity, in which the average degrees of the nodes $\{k_i\}$ are Poisson distributed, and the case of strong heterogeneity of the degrees of the nodes, in which the hidden variables $\{\theta_i\}$ are distributed according to a power law.

— Poisson hidden variable distribution

We consider the case in which the distribution of the mean values of the connectivities $k_i$ is Poisson and $\theta_i = k_i/\sqrt{\langle k\rangle N}$. This ensemble introduces a small heterogeneity with respect to pure Erdős and Rényi networks. In this case the critical point is $q_c = N\langle\theta^2\rangle = \langle k\rangle + 1$. Therefore the percolation transition happens at $\langle k\rangle + 1 = 1$, revealing that the percolating phase is already present when the mean connectivity is $\langle k\rangle \to 0$.

As predicted by the theoretical results, we find a phase transition at $q_c = N\langle\theta^2\rangle = \langle k\rangle + 1$, depending on the mean connectivity $\langle k\rangle$. In figures 1 and 2 we show the function $\phi(q)$ as a function of $q$ for different values of the parameter $\kappa = \langle k\rangle + 1$, which plays the same role as the inverse temperature in the associated Potts model. The phase transition is of second order when $q_c = \langle k\rangle + 1 < 2$ (see figure 1). On the contrary, when $\langle k\rangle + 1 > 2$ the transition is of first order (see figure 2).

In figure 3 we show the function $\omega(c)$ for this ensemble for different values of $\kappa = 1 + \langle k\rangle$. We obtain that for values of $c < \bar{c} = e^{-\langle k\rangle}$ the function $\omega(c)$ tends to minus infinity, i.e. $\omega(c) \to -\infty$. This value corresponds to the minimum number of components, which is equivalent to the number of isolated vertices. Therefore we plot the function $\omega(c)$ only for $c > \bar{c}$. As a function of the parameter $\langle k\rangle$ we find that different numbers of connected components are dominant; in particular, for $\langle k\rangle + 1 > 2$ the typical number of clusters is less than 0.5, and in the limit of high $\langle k\rangle$ this number vanishes (see figure 3).

Figure 3. The numerically evaluated $\omega(c)$ function for the hidden variable Configuration model in which the average degrees are distributed according to a Poisson distribution with $\kappa = 1 + \langle k\rangle$. We show the dependence of the typical number $c^*$ on the mean connectivity. The higher $\kappa$ is, the smaller the expected typical number of components.

Figure 4. The solution of the self-consistent equation (87) for the parameter $\rho$ as a function of the number of colors $q$ for scale-free random networks with different values of the critical point $q_c = \kappa = \langle k(k-1)\rangle/\langle k\rangle$ and power-law exponent $\gamma = 5$.

Figure 5. The two branches of the $\phi(q)$ function for a Configuration model with power-law distributed $\{\theta\}$'s with exponent $\gamma = 5$. The solid line is the function $\phi(q)$ calculated at the asymmetric solution of the Potts model and the dot-dashed line is the function $\phi(q)$ calculated for the symmetric solution. The transition point is at $q_c = \kappa = N\langle\theta^2\rangle < 2$. In the inset we show the difference in free energy between the two branches, which is continuous.
— Scale-free degree distribution

We use as degree distribution $p_\theta \sim \theta^{-\gamma}$ with $\gamma > 3$, so that the second moment does not diverge. We fix the exponent $\gamma$, letting the value of the infrared cutoff change in order to fix all the moments $\langle\theta^n\rangle$. In figure 4 we show the behavior of the solution $\rho$ of the self-consistent equation (87) as a function of the parameter $q$ for scale-free networks. When $q < q_c$, with $q_c$ defined by equation (62), i.e. $q_c = \langle k(k-1)\rangle/\langle k\rangle$, a non-zero solution $\rho \neq 0$ is found, while for $q > q_c$ the $\rho = 0$ solution becomes the stable one. The value of the stable solution $\rho$ of equation (87) close to the phase transition varies continuously in the case $q_c < 2$ and discontinuously for $q_c > 2$. There is a second-order phase transition for values of $\langle k(k-1)\rangle/\langle k\rangle < 2$ and a first-order phase transition for $\langle k(k-1)\rangle/\langle k\rangle > 2$, where the free energy discontinuously goes to zero as shown in figure 5.

In figures 5 and 6 we plot the difference of the $\phi(q)$ functions calculated on the two branches of the solution of (87) (the solution with $\rho = 0$ and the other non-trivial solution, stable for $q < q_c$). From these figures we can see that $\Delta\phi(q)$ has a discontinuity in the regime $\langle k(k-1)\rangle/\langle k\rangle > 2$ and vanishes continuously in the regime $\langle k(k-1)\rangle/\langle k\rangle < 2$. In figure 7 we show the probability of large deviations in the number of clusters for the Configuration model with power-law distributed values of the $\{\theta\}$'s. The typical number of clusters $c^*$ is a decreasing function of $\kappa = \langle k(k-1)\rangle/\langle k\rangle$.

Figure 6. The two branches of the $\phi(q)$ function for a value of the parameter $\kappa = N\langle\theta^2\rangle > 2$ for a scale-free network with $\gamma = 5$. In the inset we show the difference between the free energy associated with the symmetric solution $\rho = 0$ and the free energy associated with the asymmetric solution with $\rho \neq 0$. This difference, calculated on the two branches of the solution, shows evidence of the discontinuity in the free energy at the transition point.

— Comparison of the typical number of clusters for the Configuration model with Poisson and power-law distributed $\{\theta\}$'s.

The typical value of the number of clusters $c^*$ depends on the parameter $N\langle\theta^2\rangle$. This dependence is shown in figure 8. In the case of power-law distributed $\{\theta\}$'s, the characteristic scale above which the number $c^*$ vanishes is higher than in the Poisson case.

Figure 7. The logarithm of the probability density function $\omega(c)$ versus the number of connected components $c$. The typical value $c^*$ is easily identified as the point where the function $\omega$ touches the zero axis. This number depends strongly on $\kappa = N\langle\theta^2\rangle$. Each network corresponds to a different choice of the parameter $\gamma = 7, 5, 3.5, 3.01$, starting from the top left.

Figure 8. The typical number of clusters per vertex $c^*$ vanishes exponentially fast as $\kappa = N\langle\theta^2\rangle$ increases. Here we show this relation on a logarithmic scale for a Poissonian degree distribution and for power-law networks with $\gamma = 5$. We also report the best fit, which is in good agreement with the data points.

• Simple case of structured network: network with four equal communities

In general, structured networks with non-trivial $W(\alpha,\alpha')$ functions have to be studied on a case-by-case basis.

— First case. Here we consider the particular example in which the network is divided into four equivalent communities. These networks have been considered as benchmark networks with community structure [27]. We consider in particular the case with $N\theta_i = \text{const}\ \forall i$ and the network divided into four communities $\alpha = 1, 2, 3, 4$, i.e. $A = 4$, with

W{a,a') = \ (fi,) ^ , , (103)

Figure 9. The function $\phi(q)$ on the two branches of the solution for the network with four communities (second case under study). The network ensemble is characterized by a $W(\alpha,\alpha')$ given by (103) with $x = 0.5$ and different values of $\theta$, $\theta = 110$ and $\theta = 150$. This network shows the same crossover between a first-order and a second-order phase transition as soon as the parameter $\theta > \theta_c$, with $\theta_c \simeq 120$ calculated in equation (105). The solid line indicates the function $\phi(q)$ calculated at the asymmetric solution of the Potts model and the dot-dashed line indicates the function $\phi(q)$ calculated for the symmetric solution. This is made clearer by the discontinuity in the difference of $\phi(q)$ calculated on the two branches of the solution, reported in the inset. The solid line corresponds to $\rho = 0$ while the dashed one is $\phi(q)$ calculated on the non-trivial solution $\rho \neq 0$.

Figure 10. The logarithm of the cluster probability $\omega(c)$ for the network ensemble with four communities (second case under study). The network ensemble is characterized by a $W(\alpha,\alpha')$ given by (103) with $x = 0.5$ and the two different values of $\theta$, $\theta = 110$ and $\theta = 150$. The distribution is dominated by the typical number $c^*$, which depends strongly on the average connectivity of the network ensembles.

In this case we find that for $q > 1/64$ the only stable solution is the zero one, while for lower values of the parameter $q$ there is a non-zero solution that is stable everywhere, independently of the value of $x$. We want to emphasize the fact that, substituting equation (103) in (75), we obtain the critical relation
gc = ^, (104) 

showing that there is no phase transition for any value of the parameter $x$ at fixed $\theta$.

— Second case. We consider the case with four communities $\alpha = 1, 2, 3, 4$, i.e. $A = 4$, with $W(\alpha,\alpha')$ given by (103) at fixed $x$, for hidden variables $\theta_i = \theta\ \forall i \in A$. The behaviour in this case is exactly the same as in the Configuration model. There is a crossover between a first-order and a second-order phase transition governed by the parameter $\theta$. The threshold value is given by the relation

= 2(^ + {A- (105) 

and substituting the values $x = 0.5$ and $A = 4$, the critical value is of order $\theta_c \simeq 120$. The free energy is shown in figure 9, where the nature of the phase transition is easy to see. For completeness we also report the logarithm of the cluster distribution in figure 10, which shows the same behaviour as in the Configuration model.
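For the Poisson hidden-variable Configuration model discussed earlier in this section, the position of the transition can be explored with a simple fixed-point iteration. The sketch assumes the self-consistent equation (87) in the form $\rho = N\sum_\theta p_\theta\,\theta\,(e^{\theta\rho}-1)/(q-1+e^{\theta\rho})$ with $\theta = k/\sqrt{\langle k\rangle N}$ and $k$ Poisson distributed. Written for the rescaled variable $z = \rho/\sqrt{\langle k\rangle N}$ it becomes $z\langle k\rangle = \sum_k P(k)\,k\,s(kz)$ with $s(u) = (1-e^{-u})/(1+(q-1)e^{-u})$. These forms, the truncation `k_max` and all function names are assumptions made for illustration.

```python
import math


def poisson_pmf(mean_k, k_max):
    """Poisson probabilities P(0), ..., P(k_max), computed recursively."""
    p = [math.exp(-mean_k)]
    for k in range(1, k_max + 1):
        p.append(p[-1] * mean_k / k)
    return p


def solve_order_parameter(mean_k, q, k_max=100, iters=20_000, tol=1e-13):
    """Iterate z -> (1/<k>) sum_k P(k) k s(k z) starting from z = 1.
    The map is increasing and bounded by 1, so the iteration decreases
    monotonically towards the largest fixed point in [0, 1]."""
    pmf = poisson_pmf(mean_k, k_max)

    def s(u):
        e = math.exp(-u)
        return (1.0 - e) / (1.0 + (q - 1.0) * e)

    z = 1.0
    for _ in range(iters):
        z_new = sum(pk * k * s(k * z) for k, pk in enumerate(pmf)) / mean_k
        if abs(z_new - z) < tol:
            return z_new
        z = z_new
    return z


q = 1.5
z_below = solve_order_parameter(0.3, q)   # kappa = 1.3 < q: only z = 0 survives
z_above = solve_order_parameter(1.0, q)   # kappa = 2.0 > q: non-zero solution
```

With $q = 1.5 < 2$ the transition is continuous, so the iteration relaxes to zero when $\kappa = \langle k\rangle + 1 < q$ and to a finite value when $\kappa > q$. For $q > 2$ the same iteration returns the largest fixed point, which may be metastable, so the free energies of the two branches would have to be compared to locate the first-order transition.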



8. Conclusions 



In conclusion, we have studied the percolation transition and the large deviations of the cluster distribution in generalized canonical random network ensembles. The calculation has been performed by mapping the problem of finding the cluster distribution onto a fully connected Potts model with heterogeneous couplings. The results generalize the known results for uncorrelated configuration models to correlated configuration models and are able to predict the behavior of the phase transition for generic structured networks with non-trivial community or spatial structure. Ongoing work will investigate the role of the percolation properties of generalized random network ensembles in understanding the critical behavior of dynamical models defined on them and in characterizing their distribution of small loops.